Content
- Reproducible research?
- Key idea
- Structure of an R project (.Rproj)
- Markdown
- Live coding, tidyverse and ethnobotanyR
- Version control
- Data storage
Yamoussoukro-based full-stack ;) Researcher, speaker, cultural facilitator, R user.
Twitter: [@ehoumanevans](https://twitter.com/ehoumanevans)
My biodata on Netlify
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS High Sierra 10.13.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## loaded via a namespace (and not attached):
## [1] revealjs_0.9 digest_0.6.29 R6_2.5.1 jsonlite_1.7.2
## [5] magrittr_2.0.1 evaluate_0.14 stringi_1.7.5 rlang_0.4.12
## [9] jquerylib_0.1.4 bslib_0.3.1 rmarkdown_2.11 tools_4.1.1
## [13] stringr_1.4.0 xfun_0.28 yaml_2.2.1 fastmap_1.1.0
## [17] compiler_4.1.1 htmltools_0.5.2 knitr_1.36 sass_0.4.0
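The listing above is the kind of output `sessionInfo()` prints. As a minimal sketch of how it could be archived next to the results, the snippet below saves it to a text file; the file name `session-info.txt` and the use of a temporary directory are assumptions made for illustration.

```r
# Capture the session details so they can be archived next to the results.
info <- sessionInfo()

# R version string, for example "R version 4.1.1 (2021-08-10)" on the machine above.
r_version <- info$R.version$version.string

# Write the full listing to a text file. The file name is an arbitrary
# choice for this sketch, and tempdir() avoids touching the real project.
out_file <- file.path(tempdir(), "session-info.txt")
writeLines(capture.output(print(info)), out_file)
```

Keeping this file under version control alongside the code makes it easier to see which package versions produced a given result.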
Reproducible research: the authors provide the data and code needed to rerun the analyses and recreate the numerical results.
Replication: a study reaches the same scientific findings by collecting new data (possibly with different methods) and carrying out a new analysis.
Note that other authors and institutions swap these definitions, use the two terms interchangeably, or add the concept of repeatability.
Barba, L. A. (2018). Terminologies for reproducible research. arXiv preprint arXiv:1802.03311.
Have a plan to organise, store, & make your files available
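One possible way to act on that plan is to create the project folders from R itself. The sketch below is only an illustration: the folder names and the project name `my-analysis` are hypothetical, not a layout prescribed in these slides.

```r
# Hypothetical project root; tempdir() keeps the sketch free of side effects.
project_root <- file.path(tempdir(), "my-analysis")

# Folder names are an assumption made for this example.
folders <- c("data-raw", "data", "R", "reports", "outputs")

for (folder in folders) {
  dir.create(file.path(project_root, folder),
             recursive = TRUE, showWarnings = FALSE)
}

# A short README records what each folder is for.
readme_file <- file.path(project_root, "README.md")
writeLines(
  c("data-raw: original files, never edited by hand",
    "data: cleaned data produced by scripts",
    "R: analysis scripts and functions",
    "reports: R Markdown documents",
    "outputs: figures and tables"),
  readme_file
)

list.files(project_root)
```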
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
## Rows: 14
## Columns: 26
## $ plot_ID <chr> "BESSO-CF_Parcelle 1_2004", "BESSO-CF_Parcelle 2_…
## $ entity <chr> "Inprobois", "Inprobois", "Inprobois", "Inprobois…
## $ code_forest <chr> "BESSO-CF", "BESSO-CF", "BESSO-CF", "BESSO-CF", "…
## $ interview_date <dttm> 2021-10-20, 2021-10-20, 2021-10-20, 2021-10-20, …
## $ company <chr> "Inprobois", "Inprobois", "Inprobois", "Inprobois…
## $ status_forest <chr> "Forêt classée", "Forêt classée", "Forêt classée"…
## $ name_forest <chr> "BESSO", "BESSO", "BESSO", "BESSO", "BESSO", "BES…
## $ town <chr> "Adzopé", "Adzopé", "Adzopé", "Adzopé", "Adzopé",…
## $ village <chr> "Yakassé-Attobrou", "Yakassé-Attobrou", "Yakassé-…
## $ name_plot <chr> "Parcelle 1", "Parcelle 2", "Parcelle 3", "Parcel…
## $ years_install <chr> "2004", "2005", "2004", "2004", "2005", "2008", "…
## $ age <dbl> 17, 16, 17, 17, 16, 13, 15, 15, 12, 12, 12, 22, 2…
## $ area <dbl> 5.00, 20.00, 5.45, 4.83, 5.44, 5.16, 18.00, 6.84,…
## $ species_1 <chr> "KOTO", "TIAMA", "KOTO", "BETE", "TIAMA", "TECK",…
## $ species_2 <chr> "BETE", "FROMAGER", "KAPOKIER", "FROMAGER", "FROM…
## $ species_3 <chr> NA, NA, NA, NA, NA, NA, NA, NA, "MELINA", NA, NA,…
## $ species_4 <chr> NA, NA, NA, NA, NA, NA, NA, NA, "TIAMA", NA, NA, …
## $ species_5 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "AKO"…
## $ species_6 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "ILOM…
## $ species_7 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "AZOD…
## $ no_species <chr> "2", "2", "2", "2", "2", "2", "2", "2", "4", "2",…
## $ association <chr> "KOTO_ BETE", "TIAMA_ FROMAGER", "KOTO_ KAPOKIER"…
## $ system_culture <chr> "line/line", "line/line", "line/line", "line/line…
## $ spacing_species_line <chr> "5/10", "5/10", "5/10", "5/10", "5/10", "3/3", "1…
## $ ratio_species <chr> "50/50", "50/50", "50/50", "50/50", "50/50", "50/…
## $ density_stem_ha <chr> "200", "200", "200", "200", "200", "1100", "200",…
Markdown is a lightweight markup language created in 2004 by John Gruber with help from Aaron Swartz. It was designed to offer a syntax that is easy to read and write.
R Markdown offers a simplified syntax for formatting documents that contain text, R instructions and the results R returns when it evaluates those instructions. In that sense, it is a tool for producing detailed, annotated analysis reports rather than plain R scripts with a few comments.
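To make the mix of text and R instructions concrete, here is a minimal sketch that writes a tiny R Markdown file from R. The file name, title and chunk content are illustrative assumptions; the chunk uses the built-in `cars` dataset.

```r
# Three backticks, built programmatically to keep this listing simple.
fence <- strrep("`", 3)

rmd_lines <- c(
  "---",
  "title: \"Minimal report\"",
  "output: html_document",
  "---",
  "",
  "Some *Markdown* text describing the analysis.",
  "",
  paste0(fence, "{r}"),   # opening of an R code chunk
  "summary(cars$speed)",  # R instruction evaluated when the document is knitted
  fence                   # end of the chunk
)

rmd_file <- file.path(tempdir(), "minimal-report.Rmd")
writeLines(rmd_lines, rmd_file)

# If the rmarkdown package and pandoc are installed, the file can be rendered:
# rmarkdown::render(rmd_file)
```

The rendered document would show the prose followed by the chunk's code and its output, which is the behaviour described above.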
A study on building a dichotomous key from a floristic inventory dataset collected in the Téné classified forest (Oumé, Côte d’Ivoire).
FTA evaluation of mixed-species plantations.
Does the mean age vary from one category to another?
##
## Shapiro-Wilk normality test
##
## data: HepatitisCdata$ALP
## W = 0.76867, p-value < 2.2e-16
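A normality check like the one above can be sketched with base R's `shapiro.test()`. The `HepatitisCdata` object is not available here, so the example uses simulated right-skewed values as a stand-in; the variable names are assumptions.

```r
set.seed(1)

# Simulated right-skewed measurements standing in for a variable such as ALP.
skewed <- rexp(200)

result <- shapiro.test(skewed)

# A small p-value suggests the data depart from normality, which argues for
# rank-based tests (Wilcoxon, Kruskal-Wallis) in the comparisons that follow.
is_non_normal <- result$p.value < 0.05
result
```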
We can now proceed to compare the groups.
We started from the null hypothesis H0 that the albumin level is the same in the two categories (M, F). Since the p-value, 1.4170823 × 10^{-7}, is below 0.05, we reject H0 in favour of the alternative hypothesis. In conclusion, the median albumin level in men differs significantly from that in women.
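The decision rule above can be sketched as follows, assuming the comparison is a Wilcoxon rank-sum test (the slide does not name the test, so this is an assumption). The data are simulated and the column names `Sex` and `ALB` are chosen for illustration.

```r
set.seed(42)

# Simulated albumin values for two groups; not the real data.
albumin <- data.frame(
  Sex = rep(c("F", "M"), each = 50),
  ALB = c(rnorm(50, mean = 40, sd = 5), rnorm(50, mean = 43, sd = 5))
)

# Two-sample Wilcoxon rank-sum test of ALB between the two groups.
test <- wilcox.test(ALB ~ Sex, data = albumin)

# Reject H0 when the p-value is below the chosen significance level.
alpha <- 0.05
reject_h0 <- test$p.value < alpha
test$p.value
```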
Specific values can be displayed inline in the text.
The mean value in women is 40.5592437; what about in men?
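As a rough sketch of how such a value ends up in a sentence, the snippet below computes a mean and formats it. The four albumin values are made up for the example; in an R Markdown document the same number would usually be inserted with an inline expression, shown in the comment.

```r
# Made-up albumin values for the example.
alb_f <- c(38.5, 41.0, 43.0, 39.5)
mean_f <- mean(alb_f)

# In R Markdown, the inline form would be written in the prose as:
#   The mean value in women is `r round(mean_f, 1)`.
# The equivalent string built in plain R:
sentence <- sprintf("The mean value in women is %.1f.", mean_f)
sentence
```

Rounding inside the inline expression avoids printing long values such as 40.5592437 in the final text.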
What about the Category variable and the ALB level?
##
## Kruskal-Wallis rank sum test
##
## data: ALB by Category
## Kruskal-Wallis chi-squared = 70.072, df = 4, p-value = 2.192e-14
We can therefore run a post-hoc analysis in a thousand and one ways, after determining the median within each category.
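A minimal sketch of these two steps, the global Kruskal-Wallis test and the per-category medians, using base R only. The data are simulated, and the category labels `A`, `B`, `C` are placeholders rather than the real categories.

```r
set.seed(7)

# Simulated albumin values in three placeholder categories.
liver <- data.frame(
  Category = rep(c("A", "B", "C"), each = 30),
  ALB = c(rnorm(30, 42, 4), rnorm(30, 41, 4), rnorm(30, 33, 4))
)

# Global test: do the ALB distributions differ across categories?
kw <- kruskal.test(ALB ~ Category, data = liver)

# Median of ALB within each category, to read alongside the post-hoc results.
medians <- tapply(liver$ALB, liver$Category, median)

kw
medians
```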
Post-hoc tests
## [1] "1=Hepatitis" "0=Blood Donor" "2=Fibrosis"
## [4] "3=Cirrhosis" "0s=suspect Blood Donor"
## ## FSA v0.9.1. See citation('FSA') if used in publication.
## ## Run fishR() for related website and fishR('IFAR') for related book.
## Comparison Z P.unadj P.adj
## 1 0=Blood Donor - 0s=suspect Blood Donor 3.5219561 4.283750e-04 8.567501e-04
## 2 0=Blood Donor - 1=Hepatitis -1.8540331 6.373442e-02 9.104917e-02
## 3 0s=suspect Blood Donor - 1=Hepatitis -4.0198279 5.824067e-05 1.456017e-04
## 4 0=Blood Donor - 2=Fibrosis 0.5346925 5.928625e-01 6.587361e-01
## 5 0s=suspect Blood Donor - 2=Fibrosis -2.7975049 5.149898e-03 8.583164e-03
## 6 1=Hepatitis - 2=Fibrosis 1.6928492 9.048417e-02 1.131052e-01
## 7 0=Blood Donor - 3=Cirrhosis 7.3294542 2.310919e-13 2.310919e-12
## 8 0s=suspect Blood Donor - 3=Cirrhosis 0.1370035 8.910280e-01 8.910280e-01
## 9 1=Hepatitis - 3=Cirrhosis 6.4665675 1.002541e-10 5.012703e-10
## 10 2=Fibrosis - 3=Cirrhosis 4.4623858 8.105213e-06 2.701738e-05
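The `P.adj` column above is numerically consistent with a Benjamini-Hochberg correction of the `P.unadj` values; the output does not name the method, so treat that reading as an assumption. The sketch below applies base R's `p.adjust()` to the unadjusted p-values copied from the table.

```r
# Unadjusted p-values copied from the table above, in row order.
p_unadj <- c(4.283750e-04, 6.373442e-02, 5.824067e-05, 5.928625e-01,
             5.149898e-03, 9.048417e-02, 2.310919e-13, 8.910280e-01,
             1.002541e-10, 8.105213e-06)

# Adjusted p-values as printed in the table, used only for comparison.
p_adj_table <- c(8.567501e-04, 9.104917e-02, 1.456017e-04, 6.587361e-01,
                 8.583164e-03, 1.131052e-01, 2.310919e-12, 8.910280e-01,
                 5.012703e-10, 2.701738e-05)

# Benjamini-Hochberg adjustment (assumed method, see the note above).
p_adj_bh <- p.adjust(p_unadj, method = "BH")

# Largest relative gap between the recomputed and the printed values.
max_rel_diff <- max(abs(p_adj_bh / p_adj_table - 1))
max_rel_diff
```

The small remaining gap comes from the printed p-values being rounded to seven significant digits.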
## Group Letter MonoLetter
## 1 BloodDonor a a
## 2 ssuspectBloodDonor b b
## 3 1Hepatitis a a
## 4 2Fibrosis a a
## 5 3Cirrhosis b b
Other methods
## Warning in wilcox.test.default(xi, xj, paired = paired, ...): cannot compute
## exact p-value with ties
##
## Pairwise comparisons using Wilcoxon rank sum test with continuity correction
##
## data: HepatitisCdata$ALB and HepatitisCdata$Category
##
## 1=Hepatitis 0=Blood Donor 2=Fibrosis 3=Cirrhosis
## 0=Blood Donor 0.1637 - - -
## 2=Fibrosis 0.1637 0.5702 - -
## 3=Cirrhosis 9.8e-08 1.2e-12 5.1e-06 -
## 0s=suspect Blood Donor 0.0158 0.0048 0.0241 0.0369
##
## P value adjustment method: holm
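Output of this shape comes from base R's `pairwise.wilcox.test()`. A minimal sketch on simulated data follows; the group labels `g1` to `g3` and the values are assumptions, and because the simulated values are continuous they contain no ties, so the tie warnings do not appear.

```r
set.seed(123)

# Simulated scores for three placeholder groups.
scores <- c(rnorm(20, 10), rnorm(20, 10.5), rnorm(20, 13))
group <- factor(rep(c("g1", "g2", "g3"), each = 20))

# All pairwise Wilcoxon rank-sum tests, with Holm correction for multiplicity.
pw <- pairwise.wilcox.test(scores, group, p.adjust.method = "holm")

# Lower-triangular matrix of adjusted p-values (rows g2, g3; columns g1, g2).
pw$p.value
```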
## Dunn (1964) Kruskal-Wallis multiple comparison
## p-values adjusted with the Holm method.
## Comparison Z P.unadj P.adj
## 1 0=Blood Donor - 0s=suspect Blood Donor 3.5219561 4.283750e-04 2.570250e-03
## 2 0=Blood Donor - 1=Hepatitis -1.8540331 6.373442e-02 2.549377e-01
## 3 0s=suspect Blood Donor - 1=Hepatitis -4.0198279 5.824067e-05 4.076847e-04
## 4 0=Blood Donor - 2=Fibrosis 0.5346925 5.928625e-01 1.000000e+00
## 5 0s=suspect Blood Donor - 2=Fibrosis -2.7975049 5.149898e-03 2.574949e-02
## 6 1=Hepatitis - 2=Fibrosis 1.6928492 9.048417e-02 2.714525e-01
## 7 0=Blood Donor - 3=Cirrhosis 7.3294542 2.310919e-13 2.310919e-12
## 8 0s=suspect Blood Donor - 3=Cirrhosis 0.1370035 8.910280e-01 8.910280e-01
## 9 1=Hepatitis - 3=Cirrhosis 6.4665675 1.002541e-10 9.022865e-10
## 10 2=Fibrosis - 3=Cirrhosis 4.4623858 8.105213e-06 6.484170e-05
## PMCMR is superseded by PMCMRplus and will be no longer maintained. You may wish to install PMCMRplus instead.
## Warning in kwAllPairsNemenyiTest.default(x = HepatitisCdata$ALB, g =
## HepatitisCdata$Category, : Ties are present, p-values are not corrected.
##
## Pairwise comparisons using Tukey-Kramer-Nemenyi all-pairs test with Tukey-Dist approximation
## data: HepatitisCdata$ALB and HepatitisCdata$Category
## 1=Hepatitis 0=Blood Donor 2=Fibrosis 3=Cirrhosis
## 0=Blood Donor 0.34256 - - -
## 2=Fibrosis 0.43838 0.98377 - -
## 3=Cirrhosis 1.0e-09 2.4e-12 7.9e-05 -
## 0s=suspect Blood Donor 0.00056 0.00392 0.04118 0.99992
##
## P value adjustment method: single-step
## alternative hypothesis: two.sided
## Warning: package 'pgirmess' was built under R version 4.1.2
## Warning: multiple methods tables found for 'direction'
## Warning: multiple methods tables found for 'gridDistance'
## Multiple comparison test after Kruskal-Wallis
## p.value: 0.05
## Comparisons
## obs.dif critical.dif difference
## 1=Hepatitis-0=Blood Donor 68.62555 103.90516 FALSE
## 1=Hepatitis-2=Fibrosis 89.72619 148.78863 FALSE
## 1=Hepatitis-3=Cirrhosis 316.53161 137.40802 TRUE
## 1=Hepatitis-0s=suspect Blood Donor 306.29762 213.89713 TRUE
## 0=Blood Donor-2=Fibrosis 21.10064 110.77975 FALSE
## 0=Blood Donor-3=Cirrhosis 247.90606 94.94767 TRUE
## 0=Blood Donor-0s=suspect Blood Donor 237.67207 189.43622 TRUE
## 2=Fibrosis-3=Cirrhosis 226.80542 142.67738 TRUE
## 2=Fibrosis-0s=suspect Blood Donor 216.57143 217.31970 FALSE
## 3=Cirrhosis-0s=suspect Blood Donor 10.23399 209.69206 FALSE
Description of a dataset from the analysis of soil samples collected at Chernobyl.
Explore the dataset
## Rows: 33
## Columns: 32
## $ N <chr> "G017603", "G017601", "G0176…
## $ Site <chr> "Inner sampling area", "Inne…
## $ GP_Point <chr> "801", "802", "803", "804", …
## $ Latitude <chr> "51.380494", "51.379942", "5…
## $ Longitude <chr> "30.024562", "30.025445", "3…
## $ Date_Soil_sampled <chr> "May_June_2014", "May_June_2…
## $ Dose_rate_microSv_per_hour_measurement_1 <chr> "4.66", "6.48", "6.53", "5.3…
## $ Dose_rate_microSv_per_hour_measurement_2 <chr> "4.85", "6.52", "6.48", "4.7…
## $ Dose_rate_microSv_per_hour_measurement_3 <chr> "4.68", "6.65", "6.38", "4.9…
## $ Dose_rate_microSv_per_hour_measurement_4 <chr> "n/a", "n/a", "n/a", "n/a", …
## $ Dose_rate_microSv_per_hour_measurement_5 <chr> "n/a", "n/a", "n/a", "n/a", …
## $ Dose_rate_microSv_per_hour_measurement_6 <chr> "n/a", "n/a", "n/a", "n/a", …
## $ Dose_rate_microSv_per_hour_measurement_7 <chr> "n/a", "n/a", "n/a", "n/a", …
## $ Dose_rate_microSv_per_hour_measurement_8 <chr> "n/a", "n/a", "n/a", "n/a", …
## $ Dose_rate_microSv_per_hour_measurement_9 <chr> "n/a", "n/a", "n/a", "n/a", …
## $ Dose_rate_microSv_per_hour_measurement_10 <chr> "n/a", "n/a", "n/a", "n/a", …
## $ Dry_mass_of_soil_in_grams <chr> "10", "10.05", "10", "10.01"…
## $ Cs.137_Soil_Bq_per_sample_DM <chr> "7.75E+01", "5.76E+02", "2.3…
## $ error_Cs.137_Bq_per_Sample_DM_P.0.95 <chr> "1.02E+01", "7.62E+01", "3.1…
## $ Am.241_Soil_Bq_per_sample_DM <chr> "n/a", "n/a", "8.01E+00", "2…
## $ error_Am.241_Bq_per_Sample_DM_P.0.95 <chr> "n/a", "n/a", "3.10E+00", "7…
## $ Sr.90_Soil_Bq_per_sample_DM <chr> "4.91E+01", "3.32E+02", "1.3…
## $ error_Sr.90_Bq_per_Sample_DM_P.0.95 <chr> "1.04E+01", "7.05E+01", "2.9…
## $ Pu.239_240_Soil_Bq_per_sample_DM <chr> "n/a", "n/a", "n/a", "n/a", …
## $ error_Pu.239_240_Bq_per_Sample_DM_P.0.95 <chr> "n/a", "n/a", "n/a", "n/a", …
## $ Pu.238_Soil_Bq_per_sample_DM <chr> "n/a", "n/a", "n/a", "n/a", …
## $ error_Pu.238_Bq_per_Sample_DM_P.0.95 <chr> "n/a", "n/a", "n/a", "n/a", …
## $ Cs.137_Soil_Bq_kg_DM <dbl> 7750, 57300, 23800, 51900, 2…
## $ Sr.90_Soil_Bq_kg_DM <dbl> 4910, 33100, 13700, 13400, 6…
## $ Am.241_Soil_Bq_kg_DM <chr> "n/a", "n/a", "8.01E+02", "2…
## $ Pu.239_240_Soil_Bq_kg_DM <chr> "n/a", "n/a", "n/a", "n/a", …
## $ Pu.238_Soil_Bq_kg_DM <chr> "n/a", "n/a", "n/a", "n/a", …
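Most measurement columns above are stored as character because missing values are written as `"n/a"`. A small sketch of one way to convert them follows; the helper `to_numeric()` is hypothetical, and the sample values are copied from the listing, with an `"n/a"` entry appended for illustration.

```r
# Sample values copied from the listing, plus an "n/a" for illustration.
dose_raw  <- c("4.66", "6.48", "6.53", "n/a")
cs137_raw <- c("7.75E+01", "5.76E+02", "n/a")

# Hypothetical helper: turn "n/a" into NA, then convert to numeric.
to_numeric <- function(x) {
  x[x == "n/a"] <- NA_character_
  as.numeric(x)
}

dose  <- to_numeric(dose_raw)
cs137 <- to_numeric(cs137_raw)   # scientific notation is parsed as well

dose
cs137
```

When the file is read with base R's `read.csv()`, passing `na.strings = "n/a"` at import time achieves the same result without a helper.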
Compare parameters by site
##
## Shapiro-Wilk normality test
##
## data: Chernobyl_npp_soil$Dose_rate_microSv_per_hour_measurement_1
## W = 0.89552, p-value = 0.009
##
## Kruskal-Wallis rank sum test
##
## data: Dose_rate_microSv_per_hour_measurement_1 by Site
## Kruskal-Wallis chi-squared = 5.8276, df = 1, p-value = 0.01578
As the p-value is below the 0.05 significance level, we can conclude that the dose rate differs significantly between the two sites.
The analysis can continue with a post-hoc test.
##
## Pairwise comparisons using Wilcoxon rank sum exact test
##
## data: Chernobyl_npp_soil$Dose_rate_microSv_per_hour_measurement_1 and Chernobyl_npp_soil$Site
##
## Inner sampling area
## Pine tree 0.0098
##
## P value adjustment method: holm